Limit distribution


c9e1074f5b3f9fc8ea15d152add07294-AuthorFeedback.pdf

Neural Information Processing Systems

We thank the reviewers for their useful feedback and their time. We have corrected all the minor comments, as suggested. We now provide specific answers to each reviewer below. We thank the reviewer for their positive evaluation of our work and their comments. "All the theoretical contributions seem to me a bit marginal": since the Sliced-Wasserstein distance is an average of one-dimensional Wasserstein distances [...], we will explain these observations more explicitly to clarify our contributions.



Asymptotic Guarantees for Generative Modeling Based on the Smooth Wasserstein Distance

Neural Information Processing Systems

Minimum distance estimation (MDE) gained recent attention as a formulation of (implicit) generative modeling. It considers minimizing, over model parameters, a statistical distance between the empirical data distribution and the model. This formulation lends itself well to theoretical analysis, but typical results are hindered by the curse of dimensionality.
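
In symbols, the MDE formulation described here selects parameters by solving
$$\hat{\theta}_n \in \operatorname*{argmin}_{\theta \in \Theta} \delta\big(\hat{\mu}_n, \mu_\theta\big),$$
where $\hat{\mu}_n$ is the empirical distribution of the data, $\{\mu_\theta\}_{\theta \in \Theta}$ is the parametric generative model, and $\delta$ is the chosen statistical distance (here, the smooth Wasserstein distance).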


Backdoor Attacks on Discrete Graph Diffusion Models

Wang, Jiawen, Karim, Samin, Hong, Yuan, Wang, Binghui

arXiv.org Artificial Intelligence

Diffusion models are powerful generative models in continuous data domains such as image and video data. Discrete graph diffusion models (DGDMs) have recently extended them to graph generation, which is crucial in fields like molecule and protein modeling, and have obtained SOTA performance. However, it is risky to deploy DGDMs for safety-critical applications (e.g., drug discovery) without understanding their security vulnerabilities. In this work, we perform the first study of graph diffusion models against backdoor attacks, a severe attack that manipulates both the training and inference/generation phases of graph diffusion models. We first define the threat model, under which we design the attack such that the backdoored graph diffusion model can generate 1) high-quality graphs without backdoor activation, 2) effective, stealthy, and persistent backdoored graphs with backdoor activation, and 3) graphs that are permutation invariant and exchangeable--two core properties of graph generative models. 1) and 2) are validated via empirical evaluations without and with backdoor defenses, while 3) is validated via theoretical results.


Personalized PageRank Graph Attention Networks

Choi, Julie

arXiv.org Artificial Intelligence

There has been rising interest in graph neural networks (GNNs) for representation learning over the past few years. GNNs provide a general and efficient framework to learn from graph-structured data. However, GNNs typically use only the information of a very limited neighborhood for each node to avoid over-smoothing. A larger neighborhood would be desirable to provide the model with more information. In this work, we incorporate the limit distribution of Personalized PageRank (PPR) into graph attention networks (GATs) to reflect information from larger neighborhoods without introducing over-smoothing. Intuitively, message aggregation based on Personalized PageRank corresponds to infinitely many neighborhood aggregation layers. We show that our models outperform a variety of baseline models on four widely used benchmark datasets. Our implementation is publicly available online.
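
To make the PPR component concrete: the limit distribution of Personalized PageRank with teleport probability $\alpha$ has the closed form $\Pi = \alpha (I - (1-\alpha)\hat{A})^{-1}$, where $\hat{A}$ is the symmetrically normalized adjacency matrix with self-loops, and it can be used to propagate node features over the whole graph. Below is a minimal NumPy sketch of this propagation step; the function names and defaults are illustrative, and this is not the paper's attention mechanism.

```python
# Minimal sketch of PPR-based propagation (illustrative, not the paper's
# PPR-GAT model). The PPR limit distribution is computed in closed form:
#   Pi = alpha * (I - (1 - alpha) * A_hat)^{-1}.
import numpy as np

def ppr_limit_matrix(adj: np.ndarray, alpha: float = 0.1) -> np.ndarray:
    """Closed-form PPR limit distribution over all nodes."""
    n = adj.shape[0]
    a_tilde = adj + np.eye(n)                     # add self-loops
    d_inv_sqrt = np.diag(1.0 / np.sqrt(a_tilde.sum(axis=1)))
    a_hat = d_inv_sqrt @ a_tilde @ d_inv_sqrt     # symmetric normalization
    return alpha * np.linalg.inv(np.eye(n) - (1.0 - alpha) * a_hat)

def ppr_aggregate(adj: np.ndarray, h: np.ndarray, alpha: float = 0.1) -> np.ndarray:
    """Aggregate node features h (n x f) with the PPR limit distribution."""
    return ppr_limit_matrix(adj, alpha) @ h
```

Because $\Pi$ is dense, each node's update draws on the entire graph, which is the sense in which this corresponds to infinitely many aggregation layers.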


From Smooth Wasserstein Distance to Dual Sobolev Norm: Empirical Approximation and Statistical Applications

Nietert, Sloan, Goldfeld, Ziv, Kato, Kengo

arXiv.org Machine Learning

Statistical distances, i.e., discrepancy measures between probability distributions, are ubiquitous in probability theory, statistics and machine learning. To combat the curse of dimensionality when estimating these distances from data, recent work has proposed smoothing out local irregularities in the measured distributions via convolution with a Gaussian kernel. Motivated by the scalability of the smooth framework to high dimensions, we conduct an in-depth study of the structural and statistical behavior of the Gaussian-smoothed $p$-Wasserstein distance $\mathsf{W}_p^{(\sigma)}$, for arbitrary $p\geq 1$. We start by showing that $\mathsf{W}_p^{(\sigma)}$ admits a metric structure that is topologically equivalent to classic $\mathsf{W}_p$ and is stable with respect to perturbations in $\sigma$. Moving to statistical questions, we explore the asymptotic properties of $\mathsf{W}_p^{(\sigma)}(\hat{\mu}_n,\mu)$, where $\hat{\mu}_n$ is the empirical distribution of $n$ i.i.d. samples from $\mu$. To that end, we prove that $\mathsf{W}_p^{(\sigma)}$ is controlled by a $p$th order smooth dual Sobolev norm $\mathsf{d}_p^{(\sigma)}$. Since $\mathsf{d}_p^{(\sigma)}(\hat{\mu}_n,\mu)$ coincides with the supremum of an empirical process indexed by Gaussian-smoothed Sobolev functions, it lends itself well to analysis via empirical process theory. We derive the limit distribution of $\sqrt{n}\mathsf{d}_p^{(\sigma)}(\hat{\mu}_n,\mu)$ in all dimensions $d$, when $\mu$ is sub-Gaussian. Through the aforementioned bound, this implies a parametric empirical convergence rate of $n^{-1/2}$ for $\mathsf{W}_p^{(\sigma)}$, contrasting the $n^{-1/d}$ rate for unsmoothed $\mathsf{W}_p$ when $d \geq 3$. As applications, we provide asymptotic guarantees for two-sample testing and minimum distance estimation. When $p=2$, we further show that $\mathsf{d}_2^{(\sigma)}$ can be expressed as a maximum mean discrepancy.
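
For intuition, $\mathsf{W}_p^{(\sigma)}(\mu,\nu) = \mathsf{W}_p(\mu * \mathcal{N}_\sigma, \nu * \mathcal{N}_\sigma)$, so in one dimension a simple Monte Carlo estimate is obtained by adding independent Gaussian noise to the samples and applying the sorted-sample formula. The sketch below is illustrative only (it assumes equal sample sizes and a single noise draw) and does not reflect the paper's general $d$-dimensional analysis.

```python
# Illustrative 1-D Monte Carlo sketch of the Gaussian-smoothed
# p-Wasserstein distance: smooth both empirical measures by adding
# N(0, sigma^2) noise, then use the 1-D sorted-sample formula.
import numpy as np

def smooth_wasserstein_1d(x: np.ndarray, y: np.ndarray, p: float = 1.0,
                          sigma: float = 1.0, seed: int = 0) -> float:
    """Estimate W_p^(sigma) between two equal-size 1-D samples."""
    assert x.shape == y.shape, "sorted-sample formula assumes equal sizes"
    rng = np.random.default_rng(seed)
    xs = np.sort(x + rng.normal(0.0, sigma, size=x.shape))  # smoothed mu_n
    ys = np.sort(y + rng.normal(0.0, sigma, size=y.shape))  # smoothed nu_n
    # In 1-D with equal sample sizes, W_p^p is the mean p-th power gap
    # between order statistics.
    return float(np.mean(np.abs(xs - ys) ** p) ** (1.0 / p))
```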


Properties of the Stochastic Approximation EM Algorithm with Mini-batch Sampling

Kuhn, Estelle, Matias, Catherine, Rebafka, Tabea

arXiv.org Machine Learning

To speed up convergence, a mini-batch version of the Monte Carlo Markov Chain Stochastic Approximation Expectation Maximization (MCMC-SAEM) algorithm for general latent variable models is proposed. For exponential models, the algorithm is shown to be convergent under classical conditions as the number of iterations increases. Numerical experiments illustrate the performance of the mini-batch algorithm in various models. In particular, we highlight that an appropriate choice of the mini-batch size results in a tremendous speed-up of the convergence of the sequence of estimators generated by the algorithm. Moreover, insights on the effect of the mini-batch size on the limit distribution are presented.
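
Schematically, each SAEM iteration alternates a simulation step, a Robbins-Monro averaging of the complete-data sufficient statistics, and a maximization step; with mini-batching, the first two steps use only a sampled subset of observations. The sketch below is a hypothetical skeleton, not the authors' implementation: `simulate_latent`, `suff_stat`, and `m_step` are model-specific placeholders, and `data` is assumed to be a NumPy array.

```python
# Schematic mini-batch SAEM iteration for an exponential-family latent
# variable model (hypothetical helper names; not the authors' code).
import numpy as np

def minibatch_saem(data, theta0, simulate_latent, suff_stat, m_step,
                   n_iter=1000, batch_size=32, seed=0):
    rng = np.random.default_rng(seed)
    theta, s = theta0, None
    for k in range(1, n_iter + 1):
        idx = rng.choice(len(data), size=batch_size, replace=False)
        # Simulation step: draw latent variables for the mini-batch,
        # e.g. from an MCMC kernel targeting p(z | data, theta).
        z = simulate_latent(data[idx], theta, rng)
        # Stochastic approximation step: Robbins-Monro averaging of the
        # complete-data sufficient statistics.
        gamma = k ** -0.6                 # a common step-size choice
        s_new = suff_stat(data[idx], z)
        s = s_new if s is None else s + gamma * (s_new - s)
        # Maximization step: closed-form update in exponential families.
        theta = m_step(s)
    return theta
```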


A Reproducing Kernel Hilbert Space log-rank test for the two-sample problem

Fernandez, Tamara, Rivera, Nicolas

arXiv.org Machine Learning

Weighted log-rank tests are arguably the most widely used tests by practitioners for the two-sample problem in the context of right-censored data. Many approaches have been considered to make weighted log-rank tests more robust against a broader family of alternatives, among them considering linear combinations of weighted log-rank tests or taking the maximum over a finite collection of them. In this paper, we propose as test statistic the supremum of a collection of (potentially infinitely many) weight-indexed log-rank tests, where the index space is the unit ball of a reproducing kernel Hilbert space (RKHS). Using the properties of RKHSs, we provide an exact and simple evaluation of the test statistic and establish relationships with previous tests in the literature. Additionally, we show that for a special family of RKHSs, the proposed test is omnibus. We conclude with an empirical evaluation of the proposed methodology and an application to a real-data scenario.
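
For reference, the building block being generalized is the classical fixed-weight log-rank statistic; the paper's test instead takes a supremum over weight functions in an RKHS unit ball, and its closed-form evaluation is not reproduced here. A minimal NumPy sketch of the single-weight statistic follows (illustrative names; `event=1` marks an observed event, `group` is a 0/1 label).

```python
# Reference sketch of the classical fixed-weight two-sample log-rank
# statistic that the paper generalizes over an RKHS unit ball of weights.
import numpy as np

def weighted_logrank(time, event, group, weight=lambda t: 1.0):
    """Two-sample weighted log-rank Z-statistic (event=1, censored=0)."""
    time, event, group = map(np.asarray, (time, event, group))
    num, var = 0.0, 0.0
    for t in np.unique(time[event == 1]):
        at_risk = time >= t
        n = at_risk.sum()                          # total at risk at t
        n1 = (at_risk & (group == 1)).sum()        # group-1 at risk at t
        d = ((time == t) & (event == 1)).sum()     # events at t
        d1 = ((time == t) & (event == 1) & (group == 1)).sum()
        w = weight(t)
        num += w * (d1 - d * n1 / n)               # observed minus expected
        if n > 1:                                  # hypergeometric variance
            var += w**2 * d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    return num / np.sqrt(var)
```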